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Abstract 

The Brazilian Mathematical Olympiads for Public Schools (OBMEP) is held every 
year since 2005. In the 2013 edition there were over 47,000 schools registered involving 
nearly 19.2 million students. The Brazilian public educational system is structured into 
three administrative levels: federal, state and municipal. Students participating in the 
OBMEP come from three educational levels, two in primary and one in secondary 
school. We aim at studying the performance of Brazilian public schools which have 
been taking part of the OBMEP from 2006 until 2013. We propose a standardization 
of the mean scores of schools per year and educational level which is modeled through 
a hierarchical dynamic beta regression model. Both the mean and precision of the 
beta distribution are modeled as a function of covariates whose effects evolve smoothly 
with time. Results show that, regardless of the educational level, federal schools have 
better performance than municipal or state schools. The mean performance of schools 
increases with the human development index (HDI) of the municipality the school is 
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located in. Moreover, the difference in mean performance between federal and state or 
municipal schools tends to increase with the HDI. Schools with higher proportion of 
boys tend to have better mean performance in the second and third educational levels 
of OBMEP. 

Key words : Bayesian inference; Educational data; Multilevel models; OBMEP. 


1 Introduction 

The International Mathematical Olympiad (IMO) is organized yearly (except for 1980) since 
1959, when it was first held in Romania. Participation in an IMO is by invitation only. 
Country’s contestants should be selected through that Country’s national Mathematical 
Olympiad or an equivalent selection programme. Brazil has been participating in the IMO 
since 1979, when it organized the first Brazilian Mathematical Olympiad (OBM). OBM is 
organized by the Brazilian Mathematical Society (SBM). The aims of OBM are to stimulate 
the study of mathematics by students, develop and improve the training of teachers, influence 
the improvement of education, in addition to discovering young talents. Brazil has took part 
in 35 editions of the IMO, since then it won 9 gold, 33 silver, 68 bronze medals, and 29 
honourable mentions. The position of Brazil in the unofficial mark ordering available from 
the IMO website (https://www.imo-official.org/results.aspx) shows that it has been 
performing reasonably well, especially when one considers that Brazil is a developing country 
that faces many challenges in its educational system. 

Among the 65 economies that participated in the 2012 edition of the Program for Inter¬ 
national Student Assessment (PISA), Brazil performed below the average in mathematics, 
occupying the 58 th position in the mathematics mean score. On the other hand, in this same 
edition of PISA, the Brazilian mean mathematics performance reached 391 score points, 
having increased 35 points compared to the 2003 edition. The 2012 PISA report points 
out that between 2003 and 2012, performance gains in Brazil are largely attributed to a 
reduction in the proportion of low-performing (those who perform below the baseline Level 
2) students. Indicators from 2012 of the Organisation for Economic Co-operation and De¬ 
velopment (OECD) show that Brazil boasts one of the largest increase in expenditure on 
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education between 2000 and 2009 among the countries for which they had data available. Al¬ 
though Brazil’s spending on education as a percentage of GDP is below the OECD average, 
there has been a steady increase in the percentage of GDP invested in education, particularly 
between 2000 and 2009. Brazil increased public spending on education from 10.5% of total 
public expenditure in 2000, to 14.5% in 2005, and to 16.8% in 2009 - one of the steepest 
rates of growth among the 33 countries for which data were available. Brazil ranks 4 th in this 
measure out of the 32 countries for which data are available and above the OECD average of 
13% (OECD, 2012). One of such investments in education is the Mathematical Olympiads 
for Public Schools (Olimpiada Brasileira de Matematica em Escolas Publicas or OBMEP). 


1.1 The Brazilian Mathematical Olympiads for Public Schools 

OBMEP has been promoted since 2005 by the Ministries of Science and Technology, and 
of Education, and organized by Instituto Nacional de Matematica Pura e Aplicada (IMPA). 
OBMEP has similar aims as the OBM but with exclusive focus on public schools, wherein the 
Brazilian educational system faces serious challenges. The organizers of the OBMEP promote 
different activities: the OBMEP Program in Schools, which is focused on the mathematics 
teachers by stimulating activities outside class hours; the Olympic Pole Intensive Training 
(POTI), which offers free math courses for students enrolled in the 8 th and 9 th grades of 
elementary school and in any year of high school interested in participating in the OBMEP 
and OBM; the Scientific Initiation Program (PIC-OBMEP) which aims at giving continuous 
support to OBMEP medalists through scholarships when they start studying in an university. 
These are some examples of initiatives which are related to the organization of the OBMEP 
and are expected to strengthen the teaching of mathematics in public schools, awaken in 
students of public schools an interest for mathematics, and science in general, and provide 
those students who are OBMEP medalists with the opportunity to attend an university and 
build a career. 

The OBMEP is held every year since 2005, when there were over 31,000 schools registered, 
comprising over 10.5 million students. In 2013 there were over 47,000 schools registered, 
involving nearly 19.2 million students, covering approximately 99,5% of the municipalities 
in Brazil. In 2013, OBMEP awarded 499 students with gold medals, 900 with silver, and 
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4,600 with bronze. 


The OBMEP is structured as follows. The educational school system in Brazil comprises 
12 years of basic education, the Erst 9 years comprise the primary school and the remaining 
3 are the secondary school. Compared to other countries, the Erst 5 years can be compared 
to primary school, the next four grades can be compared to a low secondary school, and the 


last three grades are the secondary school or high school (Biondi et al. 2012). 


The public Brazilian educational system comprises three diflerent types of administrative 
school levels: municipal, state, and federal. Any of these schools are allowed to subscribe to 
take part in the OBMEP. The registration is done by the schools, and each school indicates 
how many students will take part in the first phase of the OBMEP. The students are divided 
into three different levels: 

• Level 1: students in the 6 th and 7 th grades of the primary school; 

• Level 2: students in the 8 th and 9 th grades of the primary school; 


• Level 3: students in high school. 

The OBMEP is performed in two phases: first, students take a multiple choice exam with 
20 questions for each educational level. The correction of the first phase exams is done 
locally, that is, they are corrected by the school’s own teachers. Approximately 5% of 
students with the highest scores in each level of each school, are approved for the second 
phase of OBMEP. Students who scored zero are not qualified for the second phase, even 
if his/her school has not reached the proportion of students expected to be in the second 
phase. In the second phase, students write a discursive examination comprising six questions. 
These tests are also separated by level of education. The exams of the second phase are 
marked regionally by committees formed by the OBMEP organizing committee. Typically, 
committee members are mathematical researchers from universities in the region, who have 
experience with Mathematics Olympiads. For every edition, the various regional committees 
define a cutoff point of the note. The marks are reviewed by a national committee who 
establishes the prizes to be awarded. 

We aim at studying the performance of schools across Brazil that have taken part of the 
OBMEP from 2006 until 2013, the latest year that we have information available. Under- 
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standing what covariates influence the performance of schools in the OBMEP is important 
as it might help defining, or revising, strategies about teaching mathematics and attract 
more students to the area. 


This paper is organized as follows: next section describes the dataset available and how 
the sample to be analyzed was obtained. As we are comparing the performance of schools 
across different years, with different students being exposed to different exams, we propose 
a standardization of the school’s average scores, such that they lie in the interval (0,1). For 
this reason, Section |2.2 shows a brief literature review on beta regression analysis. And 
Section [3] proposes a hierarchical dynamic beta regression model to analyze the performance 
of schools across the OBMEP editions from 2006 until 2013. The inference procedure is also 
described in detail in this section. Then, Section [4] describes the model comparison criteria 
used to choose the best model among those fitted and discusses the results under this best 
model. Finally, Section [5] discusses our findings and describes some possible avenues of future 
research about the OBMEP. 


2 Dataset description 

We have information available from three different sources. The organizers of the OBMEP 
provided us with all information from all students who registered for the OBMEP from 2005 
until 2013. We decided for removing the year of 2005 from the analysis because there are no 
records on the gender of the students. This information started to be collected from 2006 on. 
Table [ljshows the number of schools that registered for each phase of the OBMEP from 2006 
until 2013. Although the number of students present in the second phase reduces consider¬ 
ably when compared to that of the first phase, the proportion of schools in both phases tend 
to be around 90% every year. For each year, we have information on the performance of each 
student within each school in both phases. We also have the name and the national code of 
the schools. These can be linked to the schools’ census data, which is collected nationwide, ev¬ 
ery year, by Instituto Nacional de Estudos e Pesquisas Educacionais Anisio Teixeira (INEP, 
http://portal. inep.gov.br/). The census data have information about local characteris¬ 
tics of the schools. Previous studies about the OBMEP have suggested that the performance 
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Year 

No. of Schools 

in Phase 1 

No. of Schools 

in Phases 1 and 2 

Percentage of schools 

in both phases 

2006 

32,603 

29,660 

91.0% 

2007 

37,886 

35,480 

93.6% 

2008 

40,396 

35,913 

88.9% 

2009 

43,851 

39,379 

89.8% 

2010 

44,718 

39,931 

89.3% 

2011 

44,684 

39,928 

89.4% 

2012 

46,722 

40,804 

87.3% 

2013 

47,145 

42,483 

90.1% 


Table 1: Distribution of the number of schools registered for each edition and phase of the 
OBMEP from 2006 until 2013. 

of students is strongly related to the geographical region the schools are located in. As in 
Brazil the geographical regions are strongly related to the human development index (HDI), 
we also obtained information about the HDI in 2010 of each Brazilian municipality present 
in the data. This is available from http://www.pnud.org.br/IDH/DH.aspx. 

This initial study focuses on the average scores of the schools that took part in the second 
phase of the OBMEP. To ease the computational burden of estimating our models we choose 
to analyze a sample from this population. Next we discuss how this sample was obtained. 

2.1 Sampling design 

The locations of the schools that take part in the OBMEP are divided between urban and 
rural areas. In 2013, 70.1% of the schools that participated in the OBMEP are located in 
urban areas, among these, 0.6% are federal, 42.8% are state and 26.6% are municipal. The 
remaining 0.1% are private schools that incorporate some students from the public system 
and offer a curriculum similar to the public one. These private schools are excluded from 
this study. Throughout the years the distribution of urban and rural schools taking part in 
the OBMEP follows similar patterns. As the rural schools involve too many particularities 
we opted to focus only on schools located in urban areas. 
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Our aim is to model the average score of the schools in the second phase of the OBMEP. 
In order to minimize the variance of the mean average scores we propose a stratified random 
sampling scheme (Thompson, 2012). The strata are defined by the following three auxiliary 
variables: the educational level (1, 2, and 3), the administrative level of the school (federal, 
state or municipal), and different levels of the HDI. The behavior of the HDI across Brazil is 
strongly related to the geographical region^] assuming high values in the south, and smaller 
values in the north and north-east regions of the country. We expect this will capture 
local characteristics of where the school is located in. We assume z = HDI e (0,1) with 
probability density function f(z). Let zo and z\j be the smallest and largest values of 0 
in the population. We obtain stratum boundaries, Zi, z 2 , ■ ■ ■ ,Zu-i, by minimizing V(z) = 


n 'Yhh =i and ignoring the finite population correction factor (Dalenius and Hodges 


1959). In the previous equation, Wh = Nh/N is the stratum weight, and is the true 


variance of the stratum. Following this procedure, HDI was divided into 5 categories. When 
the ranges of the three auxiliary variables are combined 45 strata result. However, as federal 
schools tend to perform best when compared to other type of schools, and they represent 
only 0.6% of the schools in urban areas, we decided to define federal schools as a certainty 
strata, leading to 31 strata in total. 

The sample was obtained by first selecting nearly 20% of the schools that took part in the 
2006 edition of the OBMEP. From 2006 onwards, we checked the schools that took part in 
the subsequent editions and only kept in the sample those which participated in all editions 
until 2013. The final sample size comprises n = 2,463 schools. 


Standardization of the schools average score We propose a standardization of the 
average scores of the schools such that they are comparable across years. Let be the 
average score of school j within level i in year t, i — 1, 2, 3, j — 1, 2, ■ ■ • , rii, t = 1,2,--- ,8. 
Define W it = — J Wxjt as the average score of all schools within level i in year t. Now 
define Z l]t = where S it is the standard deviation of W ljt in year t and level i. As 

the average scores W ljt fall in the interval (0,120), we then compute minu = and 

1 See e.g. https://en.wikipedia.org/wiki/List_of_Brazilian_federative_units_by_Human_DevelopmentJndex 
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maXit = 120 Wit to obtain Y ljt = z%3t mm . lt , such that Y ijt e (0,1). This is the quantity 

on ^ maxn wizwjt J ' ' 

that will be considered as response variable in the models that will be fitted in Section |4} 


2.2 A brief literature review 

The use of the beta distribution to model rates and proportions as a function of covariates is 


relatively recent in the literature. Ferrari and Cribari-Neto (2004) proposed a beta regression 


model for rates and proportions. They provide closed-form expressions for the score function, 
for the Fisher’s information matrix and perform hypothesis testing of the coefficients using 
approximations based on the asymptotic normality of the maximum likelihood estimator. In 


particular, Ferrari and Cribari-Neto (2004), assume that if Y ~ feeto(/i, 0) then f(y | /i, 0) = 
_rM__ 2/ ^- 1 (i _ y )((i-^-b suc h that E(Y) = /i and Var(Y) = for y 6 (0,1), 

H G (0,1), and 0 > 0. They focus on the modelling of a transformation of /i as a function of 
covariates. 


Branscum et al. (2007) discuss beta regression from a Bayesian point of view. In their 


model, the mean depends on covariates through a logistic link function. They also propose a 


semiparametric beta regression, and model fitting is performed using WinBUGS (Lunn et al. 


2000 ). 


Da-Silva et al. (2011) develop a Bayesian dynamic beta regression model for time series 


of rates or proportions. They propose to approximate the posterior distribution of the state 
parameters through Bayesian linear estimation and Gaussian quadrature, avoiding the use 
of Markov chain Monte Carlos (MCMC) methods. 


Bayes et al. (2012) propose a beta rectangular regression model which allows more flex¬ 


ibility in the modelling of the tails and of the precision parameter when compared to the 
beta regression model. The inference procedure follows the Bayesian paradigm and they also 
use the software WinBUGS to obtain samples from the resultant posterior distribution of the 
model parameters. 

As the data described in Section [2] has a natural hierarchical structure, in the next 
section we propose a hierarchical dynamic beta regression model that naturally accounts for 
the different educational levels as well as the evolution in time of the observations. Also, we 
allow the precision parameter of the beta distribution to be a function of covariates, possibly 























different from the ones in the mean structure. 


3 Proposed model 


Let Y.ij t be the average score of school j = 1,2,--- , rij, within educational level i = 1, 2, • • • , /, 


in year t — 1, 2, • • ■ , T. As described in Section |2.1| the average scores of the schools were 
standardized within each year such that Yij t is a random variable defined in the interval 
(0,1). In particular, we assume 


(Yijt | f l /.jt.- t ) ~ (pijt) j 


follows a beta distribution with mean /i ijt , 0 < pij t < 1 that represents the average score of 


school j within level i in year t, and (f>ijt > 0 can be seen as a precision parameter (Ferrari and 


Cribari-Neto, 2004). In what follows we describe the proposed modelling of the components 


Pijt and (fiijt • 

Let Xj Jt be a /^-dimensional vector of covariates, and /3 it a p-dimensional vector of coef¬ 
ficients, (3 it = (/3ou, ■ ■ ■ , we assume that 

,ll3t ' - Y/ « —(la) 

(lb) 

(lc) 


l0 S ( i with 

f3 u = on + Vft, v it ~ N (0, V P i), t = 1, • • - , T, 
ct t = ct t _ 1 +tJt, ~ AT(0, W a ). 


The components and cj* are assumed mutually and internally independent, for all i and 


t. Note that the coefficients of the covariates in equation (la) vary with the educational 


level i, and year t. Also, a priori , the coefficients (3 it follow a hierarchical dynamic model 


(Gamerman and Migon 1993), in the sense that for each time t, and level i, (3 it has mean 


ot t which in turn evolves smoothly with time according to equation (lc). The parameter 


vector a. t = (a ot , • • • , a( p _iy) / is a p —dimensional vector, with each component representing 
the overall effect of the I th covariate on the logit of the mean /i^. The variances of the prior 
distribution of the coefficients, Vg*, also vary with the educational level i, such that Vpi is 
a p-dimensional diagonal matrix, with elements V^j m , m — 0,1, 2, • • • ,p — 1. And IF a is a 
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/^dimensional diagonal matrix, with each element of the diagonal representing the variance 
of the evolution in time of component an, l — 1, ■ ■ • ,p. 

For the precision parameter we assume 


log^yi = With 

Sit = + Vi a, Vi it ~ N( 0, V Si ),t = 1, ■ ■ ■ , T, 

7t = 7i-i + “ 11 . wit ~ JV(0, WV), 


(2a) 

(2b) 

(2c) 


where Q ij t is a g-dimensional vector containing the covariates that might affect the precision 
parameter and S it is a q— dimensional vector of coefficients that, a priori , also follow 
a hierarchical dynamic structure. Note that *y t is a g-dimensional vector, such that each 
component captures the overall mean of the respective component in da- As <pijt is a precision 


parameter, we use a negative sign in equation (2a) to ease interpretation of the coefficients 


6n (Smithson and Verkuilcn, 2006). The q —dimensional covariance matrix V$i is allowed to 


vary per level and is assumed to be diagonal, 1V 7 is also a diagonal matrix, implying prior 
independence among the components of S it and 7 t , respectively. Figure [l] depicts a directed 
acyclic graph of the proposed model. 



Figure 1: Directed acyclic graph of the hierarchical model proposed in equations (|Tj) and ([2]). 
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3.1 Prior specification and inference procedure 


Inference procedure is performed under the Bayesian paradigm. To complete model speci¬ 
fication and following equations ([l]) and ([2]), we are left to assign the prior distribution of 
the hypeparameters Vpi, V$i, for i = 1, 2, • • • , I, W a , W 7 , a 0 , and y 0 . We assume prior inde¬ 
pendence among the hyperparameters. For the variance parameters, we assign independent, 
inverse gamma prior distributions with infinite variance and prior mean fixed at some rea¬ 
sonable value, e.g. the maximum likelihood estimate based on independent fits for each year. 
For the components of «o and 7 0 we assign independent, zero mean normal distributions, 
with variance fixed at some reasonably large value. 

Let y be the vector comprising the average scores of the schools stacked across the 
different educational levels and years. And let © be the parameter vector comprising all the 
parameters and hyperparameters in equations 0 and (j2j). The likelihood function, /(y | ©), 
is given by 


m 


ijt) 


/(y I “) n n n r((j, ijt (f>i jt )r ((i - fj,ij t )<f>ijt) VlJt 




(1 Dijt ) 


[(1 1 ] 


where T(.) is the usual Gamma function. 

Following the Bayes’ theorem, the posterior distribution of ©, p(& | y), is proportional 
to the likelihood function times the prior distribution. As we assume independence among 
the hyperparameters, it follows that 

p(® | y) oc /(y | 0) jn \p((3 it | at t ,Vpi)p{a t \ a t -i ,W a )\ [p(S it \ 7 u v Si)p(lt I 


p -1 

n pw im ) P (W a m) 

772=0 


q -1 

p{Vsik) p(W 7 k) 

J Lfc=o 


•p(c *0 I mo,Co)p(7o I mo, Co), 


which does not have a closed analytical form. We make use of MCMC methods to obtain 
samples from the posterior distribution above. In particular we use a hybrid Gibbs sam¬ 
pler with some steps of the Metropolis-Hastings algorithm. The posterior full conditional 
distributions of f3 it and S it do not have a closed form, and are sampled using the Metropolis- 
Hastings algorithm. In particular, the MCMC algorithm is implemented using the JAGS 


software (Plummer et al. 2003). 
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4 Data Analysis 


In this Section we analyze the performance of the schools across the different editions of 
the OBMEP, from 2006 until 2013. Equations (jl]) and ([2]) propose the most general model 
specification for analyzing the data described in Section [2] We fit particular cases of the 
proposed model and use three model comparison criteria to choose the best model among 
those fitted. All fitted models assume the mean structure as a function of an intercept, 
and the following covariates: the school’s administrative level (ADM), with ADM=1 if the 
school is federal, and 0 otherwise, the standardized human development index of the munici¬ 
pality the school is located in (HDI), the presence of library (LIB), the presence of laboratory 
(LAB), and the standardized proportion of boys present in the second phase of the OBMEP 
(BOYS). Note that the HDI is a proxy to describe the social condition that schools located 
in the same municipality share. Therefore, = (1, ADM, HDI, LIB, LAB, BOYS), and 

Pit \Poit 1 Pliti P 2 iti Pbiti P^iti Pbit) ■ 

For the precision parameter we explore different models, varying from a constant precision 
parameter for each level, to different versions that assume the logarithm of the precision <j>ij t 
as a linear function of the number of students (denoted as nstudent ) who were present in the 
second phase of the OBMEP. This allows the precision parameter of the beta distribution 
to change with the number of students the school has, in each level, in the second phase of 


the OBMEP. The fitted models consider particular versions of equations (2a) and (2b); in 
particular, the following models are fitted: 


Ml Q ijt = 1, S it = S 0 , Vt = 1, 2, • • • , T and i = 1, 2, 3, q = 1; 

M2 Q' jt = (1 ,nstudentij t ) S it = (ho, hi)', Vt = 1, 2, • • • , T, and i — 1,2, 3, q — 2; 

M3 Q' ijt = (1 ,nstudenti jt ) S it = (5 0i , S u )', Vt = 1, 2, • • ■ ,T,q = 2; 

M4 Q' jt = (1 ,nstudentij t ), S it = (S ot ,S lt y, for i = 1,2,3, q = 2; 

M5 Q' jt = (1 ,nstudent ijt ), S it = (, S m ,S lit )', q = 2. 

Note that Ml assumes the precision fixed across different levels and years, M2 describes the 
logarithm of the precision as a linear function of an intercept and the number of students, 
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per school, present in the second phase of the OBMEP. Models M3 and M4 allow the co¬ 
efficients to vary by level or year, and M5 allows the coefficients to vary by level and year 
simultaneously, corresponding to the most general proposed model in equation ([2]). 

For each model we let the MCMC run for 35,000 iterations, used 5,000 as burn in and 
stored every 30th iteration. Convergence was checked using the diagnostic tools in the R 


package coda (Plummer et al. 2006). 


4.1 Model comparison 

In this Section we describe the different model comparison criteria used to compare the 
different fitted models. In particular we use the deviance information criterion proposed by 


(Spiegelhalter et ah, 2002), and two other criteria based on proper scoring rules. 


Deviance Information Criterion ( DIC ) The DIC is a generalization of the AIC based 


on the posterior distribution of the deviance, _D(@) = —2 log p{y | ©) (Spiegelhalter et al. 


2002). More formally, the DIC is defined as 


DIC = D + p D = 2D - D{&), 

where D defines the posterior expectation of the deviance, D = Eq^(D), p D is the effective 
number of parameters, with pr> = D — D(@), and © represents the posterior mean of the 
parameters. D might be seen as a goodness of fit measurement, whereas po indicates the 
complexity of the model. Smaller values of DIC indicate better fitting models. 


Scoring rules Gneiting and Raftery (2007) consider proper scoring rules for assessing the 


quality of probabilistic forecasts. Following Gschlofil and Czado (2008), we use the same 


data for estimation and computation of the scores, as our focus is on understanding the 
relationship between the schools’ performance and the covariates other than prediction. We 
use two different scoring rules: 

Ranked probability score ( RPS ) For each the RPS can be expressed as 


RPS(y iit ) = E\y% - V ijt | - -E\y% - y£f|, 
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where y^ is the observed average score of the j th school within level i in year t, y [jf and 
y[Jf are independent replicates from the posterior predictive distribution of the respective 
model. 

Assuming there is a sample of size L from the posterior distribution of the parameters in 
the model, we can obtain roughly independent replicates, y[jf and y[Jf, from the respective 
posterior predictive distribution. The components E |y”f — y l3t and E |y [JjP — y^f | can be ap¬ 
proximated using Monte Carlo integration through j- lUijt ’ ~Uijt\ an d y i Wijf ' ~ 

-rep | i 

Vijt I. and 

1 / n; T 

RPS = ^EEE RPS{y ijt ), 

1 i =i j =i t=i 

where n is the total number of schools across all the years and educational levels. Smaller 
values of RPS indicate the best model among the fitted ones. 

Logarithmic score ( LogS ) The logarithmic score is defined as — log p(yijt), where p(yijt) 
is the probability density function at the observed average score of school j in the i th level 
and year t. Considering the observed sample y, LogS is computed as 

, I Hi T 

LogS = — EEE -log p(yijt), 

1 i =i j =i t =i 

where n is the total number of schools across all the years and educational levels. Smaller 
values of LogS indicate the best model among the fitted ones. 

Assuming there is a sample from the posterior distribution of the parameters of size L 
available, the predictive distribution p(yijt) is approximated using Monte Carlo integration, 
that is, 

p(yijt) = I PyiVijt I ©M© I y)d& » T^PyiVijt | © (0 ), 

'-'© ^ i=i 

where p y {yijt \ © (d ) is the probability density function of the beta distribution conditioned 
on the I th sampled value of the parameter vector ©, evaluated at y^t- 

Table [2] shows the values of the different model comparison criteria obtained under each 
fitted model. Although the values of LogS are quite similar across the different models, M5 
performs slightly better. Model M5 performs best under DIC and RPS. The results we 
show next are based on those obtained under model M5. 
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Model 

D 

Pd 

DIC 

RPS 

LogS 

Ml 

-56247.15 

167.36 

-56079.79 

0.03653 

-1.43 

M2 

-57126.82 

155.48 

-56971.33 

0.03637 

-1.45 

M3 

-57322.07 

177.92 

-57156.78 

0.03635 

-1.45 

M4 

-57471.56 

170.94 

-57300.61 

0.03620 

-1.46 

M5 

-58031.94 

209.54 

-5782240 

0.03610 

-I .47 


Table 2: Model comparison criteria, DIC and its components (po and D), RPS, and LogS, 
under each fitted model. Numbers in italics indicate best model under the respective crite¬ 
rion. 


Fitting a normal model to logit i/ijt The standardization of the observations to the 
interval (0,1) turns it possible a comparison of the scores of the schools across the different 
years and levels. An alternative to the beta hierarchical regression model is to fit a normal 
hierarchical model to y* jt = logit y^. Basically, this assumes that y *- t r^j N (Vijt,Vijt), with 
fiijt = 'X! i j t (3 it , and log (pij t = Q'^Su, where The evolution in time of parameters 


ijt 


13 it and 6u follow the same dynamic structure as in equations (lb) and (2b), respectively. 
This model was fitted to the data following similar prior specifications as those used when 
fitting model M5 under the beta response structure. The values of the different model 
comparison criteria were the following: DIC = 48026.54, RPS = 0.4547, and logS = 1.21, all 
of them much greater than the respective values obtained under model M5, suggesting that 
it is better to fit the beta hierarchical regression model which considers the original scale of 
the observations. Another interesting comparison is to look at the fitted values under model 
M5 and this normal counterpart. To make this comparison we first obtained samples from 
the posterior predictive distribution under the beta hierarchical model M5 and from the 
normal hierarchical model. Then, under the normal model, we transformed the fitted values 
back to the original scale to be able to compare the results from the normal model with 
those from M5. From panels of Figure [2] it is clear that both models provide similar values 
of the means of the posterior predictive distribution. However, the beta hierarchical model 
provides ranges of the 95% posterior predictive interval which are, in general, narrower than 
the ones obtained under the normal model. Similar results were obtained for other schools 
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in the sample. 


Federal State Municipal 




2006 2007 2008 2009 2010 2011 2012 2013 


Year 




Figure 2: Posterior summary of the predictive distribution of three different schools according 
to their administrative (columns) and educational levels (rows), together with the observed 
values (open circles). The gray shaded area is the 95% posterior predictive credible interval 
for ijijt = 1+exp (yl ) , the gray dashed line is the mean of the posterior predictive distribution 
of y^t. The black solid line is the posterior mean of the predictive distribution, and the 
dotted-dashed lines are the limits of the 95% posterior predictive credible interval under the 
beta hierarchical model M5. 

Next we focus on the description of the posterior distribution of the parameters in model 
M5 to better understand the effect of covariates on the performance of schools across years 
and levels. 

The intercept per level, /3 0it , suggests some cyclical pattern in the performance of the 
schools across the levels and years. For level 1 this pattern is smoother than for levels 2 
and 3. In particular, in 2009, levels 2 and 3 show a drop in the performance, followed by 
an increase in 2010, suggesting that the exam in 2009 resulted in the lowest scores among 
the editions of the OBMEP considered in the sample. After 2010, the scores show a slight 
increase for levels 2 and 3 (first row of Figure |3|. Indeed, most of the schools commented 
with the organizers of the OBMEP that the exam in 2009 was too difficult when compared 
to previous years. 


16 
























As expected, regardless of the level considered, an increase in the value of the HDI impacts 
positively the logit of the mean performance of a school in the second phase of the OBMEP. 
The estimated overall effect of the HDI does not show any particular pattern across the 
years, suggesting a constant positive effect across years (second row of Figure [3]). This is 
expected as we are using the value of the HDI in 2010 for all years. 

Regarding the administrative level, federal schools perform considerably better than state 
or municipal ones. The evolution of the effect of the administrative level across the years 
seems to be relatively similar for levels 2 and 3, with both being slightly different from level 
1, especially after 2010. The overall effect of the administrative level shows a slight increase 
after 2010 (third row of Figure |3j) . Also, the effect of the administrative level of the school 
tends to be smaller for level 3 than for levels 1 and 2, especially after 2010. 

The presence of a laboratory does not seem to affect the logit of the overall mean for level 
3, whereas for levels 1 and 2 it has a small, positive effect until 2009 and 2007, respectively, 
with 0 falling within the 95% credible interval after these years. A similar behaviour is 
observed for the coefficient of the presence of a library in the school (4 th and 5 th rows of 
Figure [3]). 

Although the 95% posterior credible interval of the overall effect of the proportion of boys 
in the second phase includes 0 for all years (6 th row and 4 th column of Figure [3]), interesting 
characteristics of the effect of this covariate are observed when we disentangle the effect per 
level. For levels 2 and 3, an increase in the proportion of boys result in an increase in the 
respective logit of the mean score of the school. On the other hand, the proportion of boys in 
the second phase does not seem to affect the logit of the mean score of level 1, being strictly 
positive only in 2007. 

The posterior summary of the coefficients related to the modelling of the precision pa¬ 
rameter confirm the need of allowing different values across years and levels (see panels of 
Figure [4]). An increase in the number of students present in the second phase of OBMEP 
result in an increase in the respective precision parameter. Also, these effects vary across 
levels and years. 

Panels of Figure [5] show the posterior summary of the variances of the coefficients, and 
the respective variances of the evolution equation, W am and W lk (m = 0,1, • • • ,5, and 
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Figure 3: Posterior summary (mean: solid line, and limits of the 95% credible intervals: 
shaded area) of the coefficients /% t , for the intercept, HDI, ADM, LAB, LIB, and BOYS 
(rows) by level (columns 1, 2 and 3), and respective overall effect, ai t (4 th column), l = 
0 , 1 , 2 , 3 , 4 , 5 . 


k = 0,1), for the coefficients in the logit of the mean (first row), and in the log of the 


precision parameters (second row) (see equations ( fib]) , ( flcf ) , ( J2b[ ) , and (|2c[)). For both cases, 

-, under 


the respective intercepts result in the highest values of the variances. 


Panels of Figure 


show the posterior summary of the mean, 


exp(E*=o/3« x Ut 


(l+exp(Ei=o 

some particular scenarios, for the different educational levels and last observed year, t = 8 
(year 2013). Panels in the first two columns of Figure [6] show the behaviour of the mean 
as a function of different values of the (standardized) HDI, considering low and high values 
of the standardized proportion of boys in the second phase, for federal, state or municipal 
schools, and those with laboratory (LAB=1), and library (LIB=1). Clearly, regardless of 
the educational level (different rows) considered, federal schools show a steeper increase on 
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Level 1 Level 2 Level 3 Overall Effect 



2006 ' 2008 ' 2010 ' 2012 ' 2006 ' 2008 ' 2010 ' 2012 ' 2006 ' 2008 ' 2010 ' 2012 ' 2006 ' 2008 ' 2010 ' 2012 ' 
Year Year Year Year 


Figure 4: Posterior summary (mean: solid line, and limits of the 95% credible intervals: 
shaded area) of the intercept and the coefficient of nstudent for the precision parameter, 
dmit, together with the respective overall effect, 7 mt , m = 0,1 ( A th column). 
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Figure 5: Posterior summary (mean: solid circle, and limits of the 95% credible intervals: 
solid lines) of the variances (W am and W 7 *,, m — 0,1, • • • ,5, and k — 0,1) of the coefficients 
of the covariates in the logit of the mean (first row) and log of the precision (second row) 


parameters. See equations (lb), (lc), (2b), and (2c) 


the mean as the HD1 increases, when compared to state or municipal ones. This leads to 
a greater difference in mean performance between federal and state or municipal schools 
located in municipalities with high values of the HD1. 

Panels in the last two columns of Figure [ 6 ] show the behaviour of the mean as a function 
of different values of the (standardized) proportion of boys, for low and high values of the 
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standardized HDI, federal, state or municipal schools, with laboratory (LAB=1), and library 
(LIB=1). In the first educational level (first row and columns 3 and 4) as the proportion of 
boys increases the mean tends to be constant, independent of the value of the HDI. Again, 
the effect of the HDI is very clear: regardless of the administrative level, schools located 
in municipalities with high values of HDI perform better than those in locations with low 
values of the HDI. Also, it is clear that there is a greater difference in mean performance 
between federal and state or municipal schools in locations with high value of the HDI, when 
compared to those with low value of the HDI. For the second and third educational levels 
(second and third rows and columns 3 and 4) it can be noticed a smooth increase of the 
mean as the proportion of boys increases. 


oo 

o 


Low Prop. Boys 



High Prop. Boys 



oq 

ci 




Low HDI 


High HDI 


co 

o 




- Federal 

— State or Mun. 
95% Cl 


-3 ' -1 ' 1 2 
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-3 ' -1 1 2 

Boys 


Figure 6: Posterior summary (mean: solid (federal schools) and dashed (state or municipal 
schools) lines, and respective limits of the 95% credible intervals: shaded areas) of the mean, 
7 — P (^A.°^!/ ‘ Jt ~ \ , for the different educational levels 1, 2, and 3 (rows), year t — 8 (year 

(l+exp(£i=o PitXijt)) ' 

2013), and different values of x' Jt =(l,ADM=l or 0,HDI,Lab=l,Lib=l,Boys). 
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Our model allows to compare the performance of schools according to their administrative 
and educational levels. Panels of Figure [7] show the posterior mean of /i lJ t of each school in 
the sample, grouped by its educational and administrative levels. Clearly, independent of 
the educational level considered, federal schools have mean performance greater than state 
and municipal ones. However, for level 3, the difference in performance between federal and 
state schools seem to be smaller than this difference for levels 1 and 2. Also, the posterior 
mean of federal schools in level 3 (first column and third row) show more variability than 
in levels 1 and 2. State and municipal schools have very similar performance, with scores 
varying below 0.3 for most of the editions. It is noticeable that all schools presented a drop 
in performance in the editions of 2008 and 2009. 


Federal 

Ajk 


State 


Municipal 



KM KM 



AM 


2006 2008 2010 2012 

Year 



2006 2008 2010 2012 

Year 


Figure 7: Posterior mean of the mean score, /iy t , for each school in the sample, per year, 
grouped by educational (rows) and administrative (columns) levels. 


5 Discussion 

This paper analyses the performance of schools that took part in eight editions of the second 
phase of the OBMEP, from 2006 until 2013. As different exams are given in different years, 
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we propose an ad hoc standardization, per level and year, of the schools’ mean scores such 
that they lie in the interval (0,1). It would have been better if organizers of the OBMEP 
used some tool to standardize the level of difficulty of the exams across years. This is an 
issue that should be tackled in the next editions of the OBMEP. 

A hierarchical dynamic beta regression model is proposed to investigate the importance 
of some covariates in explaining the performance of schools in different educational levels. 
We allow the coefficients of the covariates to vary per level and year. We also explore models 
that allow the precision parameter of the beta distribution to be a function of the number 
of students in the school present in the second phase of the OBMEP. Inference procedure 
is performed under the Bayesian paradigm and uncertainty about parameters’ estimation 
is obtained in a straightforward fashion. A sample from the posterior distribution of the 
parameters was obtained through MCMC. In particular we used the Gibbs sampler with 
some steps of the Metropolis-Hastings algorithms. Implementation of the MCMC algorithm 
was done using the software JAGS. Model comparison criteria suggest model M5, which 
assumes the logarithm of the precision parameter as a function of the number of students in 
the second phase of the OBMEP, with effects varying per level and year, perform best when 
compared to simpler versions of the proposed model. 

Important conclusions are drawn from this study. Overall, the mean performance of 
schools tend to be low, with scores varying below 0.3 for state and municipal schools in all 
three educational levels. Results show that, in general, federal schools perform better than 
state or municipal ones. However, the posterior mean of federal schools in level 3 show more 
variability than in levels 1 and 2 (Figure [7]). The difference between federal, and state and 
municipal schools tend to be greater in the first and second levels of the OBMEP. Federal 
schools show a steeper increase in their mean performance, as a function of the HDI, than 
state or municipal ones. As in Brazil the HDI is highly correlated with the geographical 
region, federal schools in the south and south-east (with higher values of HDI) regions of the 
country tend to perform better than federal schools in the north or north-east (with lower 
values of HDI) regions of Brazil. Also, state or municipal schools in locations with higher 
values of HDI tend to have mean performance closer to those of federal schools in locations 
with lower values of HDI, regardless of the proportion of boys the school has in the second 
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phase of the OBMEP (first two columns of Figure [6]). 

Schools with greater proportion of boys in the second phase of the OBMEP tend to 
perform slightly better in the second and third levels, with this covariate having no effect 
on the mean of the scores for the first level (last two columns of Figure [6]). The possible 
difference in performance of boys and girls in mathematics exams has been the object of 


interest in different studies (Hyde and Mertz, 2009). The analysis of the results of PISA 
2009 show that boys outperformed girls in mathematics in 35 out of the 65 countries and 
economies that took part in PISA 2009. On the other hand, for 25 countries no significant 
difference was observed between the genders, whereas for 5 countries girls outperformed boys 


in the mathematics exam of PISA 2009 (OECD, 2011). Why the proportion of boys in the 
school affects positively the average scores in levels 2 and 3 of the OBMEP, when the students 
are slightly older and with a better understanding about their interests, clearly needs deeper 
investigation and understanding. 

Our current interest is to investigate what kind of impact the OBMEP has on the Brazil¬ 


ian educational system. Biondi et ah (2012) quantify the effects of the 2007 edition of the 
OBMEP on the average math scores of the ninth-graders participating in Prova Brasil, which 
is a national exam applied by INEP to all Brazilian students in the 8 th and 9 th grades of 
publich schools. We plan to focus on students in the last year of high school. Considering 


different years we plan to use causal inference and propensity score methods (Hirano and 


Imbens|, 2004) to investigate the effect of the OBMEP on the performance of students in 


different editions of the High School Brazilian National Exam (Exame Nacional do Ensino 
Medio , ENEM). Every year, results of the ENEM are used by nearly 500 universities in 
Brazil as a selection criterion for admission to higher education. 


Acknowledgements 

This work is part of Moraes’ M.Sc. Dissertartion under the supervision of A.M. Schmidt 
and H. S. Migon. We are grateful for financial support from CNPq and FAPERJ (Schmidt), 
CAPES (scholarship to Moraes) and CNPq (Migon). The authors thank Professor Claudio 
Landim (IMPA, Brazil), Monica Souza (OBMEP) and Luiz L. R. da Conceicao (OBMEP) 


23 














for introducing the problem and making the dataset available. 


References 

Bayes, C. L., Bazan, J. L. and Garcia, C. (2012) A new robust regression model for propor¬ 
tions. Bayesian Analysis , 7, 841-866. 

Biondi, R. L., Vasconcellos, L. and Menezes-Filho, N. (2012) Evaluating the impact of the 
Brazilian Public School Math Olympics on the quality of education. Economia, 12, 143- 
175. 

Branscnm, A. J., Johnson, W. O. and Thurmond, M. C. (2007) Bayesian beta regression: 
applications to household expenditure data and genetic distance between foot-and-mouth 
disease viruses. Australian New Zealand Journal of Statistics, 49, 287-301. 

Da-Silva, C. Q., Migon, H. S. and Correia, L. T. (2011) Dynamic Bayesian beta models. 
Computational Statistics & Data Analysis, 55, 2074-2089. 

Dalenius, T. and Hodges, J. L. J. (1959) Minimum variance stratification. Journal of the 
American Statistical Association, 54, 88-101. 

Ferrari, S. and Cribari-Neto, F. (2004) Beta regression for modelling rates and proportions. 
Journal of Applied Statistics, 31, 799-815. 

Gamerman, D. and Migon, H. S. (1993) Dynamic hierarchical models. Journal of the Royal 
Statistical Society, Series B, 55, 629-642. 

Gneiting, T. and Raftery, A. E. (2007) Strictly proper scoring rules, prediction and estima¬ 
tion. Journal of the American Statistical Association, 102, 359-378. 

Gschlofil, S. and Czado, C. (2008) Modelling count data with overdispersion and spatial 
effects. Statistical Papers, 49, 531-552. 

Hirano, K. and Imbens, G. W. (2004) The propensity score with continuous treatments. In 
Applied Bayesian modeling and causal inference from incomplete-data perspectives, 73-84. 
Gelrnan, A. and Meng, X. L.(eds). Wiley, Oxford, U.K. 


24 



Hyde, J. S. and Mertz, J. E. a. (2009) Gender, culture, and mathematics performance. 
Proceedings of the National Academy of Sciences of the United States of America , 106, 
8801-8807. 

Lunn, D. J., Thomas, A., Best, N. and Spiegelhalter, D. (2000) WinBUGS - a Bayesian 
modelling framework: concepts, structure, and extensibility. Statistics and computing , 10, 
325-337. 

OECD (2011) How do girls compare to boys in mathematics skills? In PISA 2009 
at a Glance. OECD Publishing, Paris, France. URLhttp://dx. doi . org/10.1787/ 
9789264095250-8-en. 

(2012) Education at a Glance 2012: OECD Indicators. URL/content/book/ 

9789264095298-en. 

Plummer, M., Best, N., Cowles, K. and Vines, K. (2006) CODA: Convergence diagnosis and 
output analysis for MCMC. R News , 6, 7—11. URLhttp://CRAN. R-project. org/doc/ 
Rnews/. 

Plummer, M. et al. (2003) JAGS: A program for analysis of Bayesian graphical models using 
Gibbs sampling. In Proceedings of the 3rd international workshop on distributed statistical 
computing , vol. 124, 125. Vienna. 

Smithson, M. and Verkuilen, J. (2006) A better lemon squeezer? Maximum-likelihood re¬ 
gression with beta-distributed dependent variables. Psychological methods , 11, 54. 

Spiegelhalter, D., Best, N., Carlin, B. and Linde, A. (2002) Bayesian measures of model 
complexity and fit. Journal of the Royal Statistical Society, Series B, 64, 583-639. 

Thompson, S. K. (2012) Sampling. Wiley Series in Probability and Statistics, 3rd edition. 


25 


